ABSTRACT

Gauge invariance is the basis of the modern theory of electroweak and strong interactions 
(the so called Standard Model). The roots of gauge invariance go back to the year 1820 when 
electromagnetism was discovered and the first electrodynamic theory was proposed. 
Subsequent developments led to the discovery that different forms of the vector potential 
result in the same observable forces. The partial arbitrariness of the vector potential A
brought forth various restrictions on it. $\nabla \cdot \mathbf{A} = 0$ was proposed by J. C. Maxwell;
$\partial_\mu A^\mu = 0$ was proposed by L. V. Lorenz in the middle of the 1860s. In most modern texts the latter
condition is attributed to H. A. Lorentz, who half a century later was one of the key figures in 
the final formulation of classical electrodynamics. In 1926 a relativistic quantum-mechanical 
equation for charged spinless particles was formulated by E. Schrödinger, O. Klein, and V.
Fock. The latter discovered that this equation is invariant with respect to multiplication of the
wave function by a phase factor $\exp(ie\chi/\hbar c)$ with the accompanying additions to the scalar
potential of $-\partial\chi/c\,\partial t$ and to the vector potential of $\nabla\chi$. In 1929 H. Weyl proclaimed this
invariance as a general principle and called it Eichinvarianz in German and gauge invariance 
in English. The present era of non-abelian gauge theories started in 1954 with the paper by C. 
N. Yang and R. L. Mills. 

I. INTRODUCTION

The principle of gauge invariance plays a key role in the Standard Model which 
describes electroweak and strong interactions of elementary particles. Its origins can be 
traced to Vladimir Fock (1926b) who extended the known freedom of choosing the 
electromagnetic potentials in classical electrodynamics to the quantum mechanics of charged 
particles interacting with electromagnetic fields. Equations (5) and (9) of Fock's paper are, in 
his notation, 

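$$ \mathbf{A} = \mathbf{A}_1 + \nabla f $$

$$ \varphi = \varphi_1 - \frac{1}{c} \frac{\partial f}{\partial t} , \qquad \text{[Fock's (5)]} $$

and

$$ p = p_1 - \frac{e}{c} f $$

$$ \psi = \psi_0 \, e^{2\pi i p/h} . \qquad \text{[Fock's (9)]} $$

In present day notation we write

$$ \mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\chi , \qquad (1a) $$

$$ \Phi \to \Phi' = \Phi - \frac{1}{c} \frac{\partial \chi}{\partial t} , \qquad (1b) $$

$$ \psi \to \psi' = \psi \, \exp(ie\chi/\hbar c) . \qquad (1c) $$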

Here $\mathbf{A}$ is the vector potential, $\Phi$ is the scalar potential, and $\chi$ is known as the gauge
function. The Maxwell equations of classical electromagnetism for the electric and magnetic
fields are invariant under the transformations (1a,b) of the potentials. What Fock discovered
was that, for the quantum dynamics, that is, the form of the quantum equation, to remain
unchanged by these transformations, the wave function is required to undergo the
transformation (1c), whereby it is multiplied by a local (space-time dependent) phase. The
concept was declared a general principle and "consecrated" by Hermann Weyl (1928, 1929a,
1929b). The invariance of a theory under combined transformations such as (1a,b,c) is
known as a gauge invariance or a gauge symmetry and is a touchstone in the creation of 
modern gauge theories. 
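
The invariance just described can be checked in a few lines. The block below is a sketch, not part of the original article: it assumes the standard Gaussian-unit definitions of the fields in terms of the potentials and the usual minimal-coupling combinations for a particle of charge $e$, with the sign conventions of (1a,b,c). The symbol $\theta$ is introduced here only as shorthand.

```latex
% Assumed definitions (Gaussian units), not quoted from the article:
%   E = -\nabla\Phi - (1/c)\,\partial A/\partial t ,   B = \nabla \times A .
\begin{align*}
\mathbf{B}' &= \nabla \times (\mathbf{A} + \nabla\chi) = \mathbf{B}
   && \text{since } \nabla \times \nabla\chi = 0 , \\
\mathbf{E}' &= -\nabla\Big(\Phi - \frac{1}{c}\frac{\partial\chi}{\partial t}\Big)
               - \frac{1}{c}\frac{\partial}{\partial t}\big(\mathbf{A} + \nabla\chi\big)
             = \mathbf{E} . \\
% Quantum part, with the shorthand \theta = e\chi/\hbar c :
\Big(-i\hbar\nabla - \frac{e}{c}\mathbf{A}'\Big)\,\psi e^{i\theta}
  &= e^{i\theta}\Big(-i\hbar\nabla - \frac{e}{c}\mathbf{A}\Big)\psi , \\
\Big(i\hbar\frac{\partial}{\partial t} - e\Phi'\Big)\,\psi e^{i\theta}
  &= e^{i\theta}\Big(i\hbar\frac{\partial}{\partial t} - e\Phi\Big)\psi .
\end{align*}
```

In each of the last two lines the derivative of the phase produces a term in $\nabla\chi$ or $\partial\chi/\partial t$ that cancels the shift of the potential, so any wave equation built from these combinations keeps its form.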

The gauge symmetry of Quantum Electrodynamics (QED) is an abelian one, described
by the U(1) group. The first attempt to apply a non-abelian gauge symmetry SU(2) x U(1) to
electromagnetic and weak interactions was made by Oskar Klein (1938). But this prophetic
paper was forgotten by the physics community and never cited by the author himself. 

The proliferation of gauge theories in the second half of the 20th century began with 
the 1954 paper on non-abelian gauge symmetries by Chen-Ning Yang and Robert L. Mills 
(1954). The creation of a non-abelian electroweak theory by Glashow, Salam, and Weinberg 
in the 1960s was an important step forward, as were the technical developments by 't Hooft 
and Veltman concerning dimensional regularization and renormalization. The discoveries at 
CERN of the heavy W and Z bosons in 1983 established the essential correctness of the 
electroweak theory. The very extensive and detailed measurements in high-energy electron- 
positron collisions at CERN and at SLAC, and in proton-antiproton collisions at Fermilab and 
in other experiments have brilliantly verified the electroweak theory and determined its 
parameters with precision. 

In the 1970's a non-abelian gauge theory of strong interaction of quarks and gluons
was created. One of its creators, Murray Gell-Mann, gave it the name Quantum 
Chromodynamics (QCD). QCD is based on the SU(3) group, as each quark of given "flavor" 
(u, d, s, c, b, t) exists in three varieties or different "colors" (red, yellow, blue). The quark 
colors are analogues of electric charge in electrodynamics. Eight colored gluons are analogues 
of the photon. Colored quarks and gluons are confined within numerous colorless hadrons. 

QCD and Electroweak Theory form what is called today the Standard Model, which is
the basis of all of physics except for gravity. All experimental attempts to falsify the Standard
Model have failed up to now. But one of the cornerstones of the Standard Model still awaits
its experimental test. The search for the so-called Higgs boson (or simply, higgs) or its
equivalent is of profound importance in particle physics today. In the Standard Model, this
electrically neutral, spinless particle is intimately connected with the mechanism by which
quarks, leptons and W-, Z-bosons acquire their masses. The mass of the higgs itself is not
restricted by the Standard Model, but general theoretical arguments imply that the physics
will be different from expected if its mass is greater than 1 TeV/$c^2$. Indirect indications from
LEP experimental data imply a much lower mass, perhaps 100 GeV/$c^2$, but so far there is no
direct evidence for the higgs. Discovery and study of the higgs was a top priority for the
aborted Superconducting Super Collider (SSC). Now it is a top priority for the Large Hadron
Collider (LHC) under construction at CERN.

The key role of gauge invariance in modern physics makes it desirable to trace its
historical roots, back to the beginning of the 19th century. This is the main aim of our review.
In section II, the central part of our article, we describe the history of classical
electrodynamics with a special emphasis on the freedom of choice of potentials $\mathbf{A}$ and $\Phi$
expressed in equations (1a) and (1b).

It took almost a century to formulate this non-uniqueness of potentials that exists 
despite the uniqueness of the electromagnetic fields. The electrostatic potential resulting 
from a distribution of charges was intimately associated with the electrostatic potential 
energy of those charges and had only the trivial arbitrariness of the addition of a constant. 
The invention of Leyden jars and the development of voltaic piles led to study of the flow of 
electricity and in 1820 magnetism and electricity were brought together by Oersted's 
discovery of the influence of a nearby current flow on a magnetic needle (Jelved, Jackson, and 
Knudsen, 1998). Ampere and others rapidly explored the new phenomenon, generalized it to 
current-current interactions, and developed a mathematical description of the forces between 
closed circuits carrying steady currents (Ampere, 1827). In 1831 Faraday made the discovery 
that a time varying magnetic flux through a circuit induces current flow (Faraday, 1839). 
Electricity and magnetism were truly united. 

The lack of uniqueness of the scalar and vector potentials arose initially in the desire 
of Ampere and others to reduce the description of the forces between actual current loops to 
differential expressions giving the element of force between infinitesimal current elements, 
one in each loop. Upon integration over the current flow in each loop, the total force would 
result. Ampere believed the element of force was central, that is, acting along the line joining 
the two elements of current, but others wrote down different expressions leading to the same 
integrated result. In the 1840's the work of Neumann (1847, 1849) and Weber (1878, 1848) 
led to competing differential expressions for the elemental force, Faraday's induction, and the 
energy between current elements, expressed in terms of different forms for the vector 
potential A. Over 20 years later Helmholtz (1870) ended the controversy by showing that
Neumann's and Weber's forms for A were physically equivalent. Helmholtz's linear
combination of the two forms with an arbitrary coefficient is the first example of what we now
call a restricted class of different gauges for the vector potential.

The beginning of the last third of the 19th century saw Maxwell's masterly creation of 
the correct complete set of equations governing electromagnetism, unfortunately expressed in 
a way that many found difficult to understand (Maxwell, 1865). Immediately after, the Danish 
physicist Ludvig V. Lorenz, apparently independently of Maxwell, brilliantly developed the 
same basic equations and conclusions about the kinship of light and the electromagnetism of 
charges and currents (Lorenz, 1867b). From the point of view of gauge invariance, Lorenz's 
contributions are most significant. He introduced the so-called retarded scalar and vector 
potentials and showed that they satisfied the relation almost universally known as the 
Lorentz condition, though he preceded the Dutch physicist H. A. Lorentz by more than 25
years. 

By the turn of the century, thanks to, among others, Clausius, Heaviside, Hertz, and 
Lorentz, who invented what we now call microscopic electromagnetism, with localized 
charges in motion forming currents, the formal structure of electromagnetic theory, the role of 
the potentials, the interaction with charged particles, the concept of gauge transformations, 
not yet known by that name, were in place. Lorentz's encyclopedia articles (Lorentz, 1904a, 
b) and his book (Lorentz, 1909) established him as an authority in classical electrodynamics, 
to the exclusion of earlier contributors such as Lorenz. 

The start of the 20th century saw the beginning of the quantum, of special relativity, 
and of radioactive transformations. In the 1910s attention turned increasingly to atomic 
phenomena, with the confrontation between Bohr's early quantum theory and experiment. By 
the 1920s the inadequacies of the Bohr theory were apparent. In 1925-1926, Heisenberg, 
Schrödinger, Born, and others invented quantum mechanics. Inevitably, when the interaction
of charged particles with time varying electromagnetic fields came to be considered in 
quantum mechanics, the issue of the arbitrariness of the potentials would arise. What was 
not anticipated was how the consequences of a change in the electromagnetic potentials on 
the quantum mechanical wave function became transformed into a general principle that 
defines what we now call quantum gauge fields. 

In Section III we review the extension of the concept of gauge invariance in the early 
quantum era, the period from the end of the first World War to 1930, with emphasis on the 
annus mirabilis, 1926. As in Section II, we explore how and why priorities for certain 
concepts were taken from the originators and bestowed on others. We also retell the well- 
known story of the origin of the term "gauge transformation." In Section IV we discuss briefly 
the physical meaning of gauge invariance and describe the plethora of different gauges in 
sometime use today, but leave the detailed description of subsequent developments to 
others. 

In writing this article we mainly relied on original articles and books, but in some 
instances we used secondary sources (historical reviews and monographs). Among the many
sources on the history of electromagnetism in the 19th century we mention Whittaker (1951), 
Reiff and Sommerfeld (1902), Hunt (1991), Darrigol (2000), Buchwald (1985, 1989, 1994), 
and Rosenfeld (1957). Volume 1 of Whittaker (1951) surveys all of classical electricity and 
magnetism. Reiff and Sommerfeld (1902) provide an early review of some facets of the 
subject from Coulomb to Clausius. As his title implies, Hunt (1991) focuses on the British 
developments from Maxwell to 1900, in particular, the works of George F. FitzGerald, Oliver 
Lodge, Oliver Heaviside, and Joseph Larmor, as Maxwell's theory evolved into the 
differential equations for the fields that we know today. Darrigol (2000) covers the 
development of theoretical and experimental electromagnetism in the 19th century, with 
emphasis on the contrast between Britain and the Continent in the interpretations of 
Maxwell's theory. Buchwald (1985) describes in detail the transition in the last quarter of
the 19th century from the macroscopic electromagnetic theory of Maxwell to the microscopic 
theory of Lorentz and others. Buchwald (1989) treats early theory and experiment in optics in 
the first part of the 19th century. Buchwald (1994) focuses on the experimental and 
theoretical work of Heinrich Hertz as he moved from Helmholtz's pupil to independent 
authority with a different world view. Rosenfeld's essay (Rosenfeld, 1957) focuses on the 
mathematical and philosophical development of electrodynamics from Weber to Hertz, with 
special emphasis on Lorenz and Maxwell. His comments on Lorenz's "modern" outlook are 
very similar to ours. None of these works stress the development of the idea of gauge 
invariance. 

The early history of quantum gauge theories as well as more recent developments 
have been extensively documented (Okun, 1986; Yang, 1986; Yang, 1987; O'Raifeartaigh, 
1997; O'Raifeartaigh and Straumann, 2000). 

Ludvig Valentin Lorenz of the classical era and Vladimir Aleksandrovich Fock emerge 
as physicists given less than their due by history. The many accomplishments of Lorenz in 
electromagnetism and optics are summarized by Kragh (1991, 1992) and more generally by 
Pihl (1939, 1972). Fock's pioneering researches have been described recently, on the 
occasion of the 100th anniversary of his birth (Novozhilov and Novozhilov, 1999, 2000; 
Prokhorov, 2000). 

The word "gauge" was not used in English for transformations such as (1a,b,c) until
1929 (Weyl, 1929a). It is convenient, nevertheless, to use the modern terminology even 
when discussing the works of 19th century physicists. Similarly, we usually write equations 
in a consistent modern notation, using Gaussian units for electromagnetic quantities. 



II. CLASSICAL ERA

On 21 July 1820 Oersted announced to the world his amazing discovery that magnetic 
needles were deflected if an electric current flowed in a circuit nearby, the first evidence that 
electricity and magnetism were related (Jelved, Jackson, and Knudsen, 1998). Within weeks 
of the news being spread, experimenters everywhere were exploring, extending, and making 
quantitative Oersted's observations, nowhere more than in France. In the fall of 1820, Biot 
and Savart studied the force of a current-carrying long straight wire on magnetic poles and 
announced their famous law - that for a given current and pole strength, the force on a pole 
was perpendicular to the wire and to the radius vector, and fell off inversely as the 
perpendicular distance from the wire (Biot and Savart, 1820). On the basis of a calculation of 
Laplace for the straight wire and another experiment with a V-shaped wire, Biot abstracted 
the conclusion that the force on a pole exerted by an increment of the current of length ds was 

(a) proportional to the product of the pole strength, the current, the length of the segment, the 
square of the inverse distance r between the segment and the pole, and to the sine of the 
angle between the direction of the segment and the line joining the segment to the pole, and 

(b) directed perpendicular to the plane containing those lines (Biot, 1824). We recognize this
as the standard expression for an increment of magnetic field dB times the pole strength - see
for example Eq. (5.4), p. 175 of Jackson (1998).

At the same time Ampere, in a brilliant series of demonstrations before the French 
Academy, showed, among other things, that small solenoids carrying current behaved in the 
Earth's magnetic field as did bar magnets, and began his extensive quantitative observations 
of the forces between closed circuits carrying steady currents. These continued over several
years; the papers were collected in a memoir in 1826 (Ampere, 1827). 

The different forms for the vector potential in classical electromagnetism arose from 
the competing versions of the elemental force between current elements abstracted from 
Ampere's extensive observations. These different versions arise because of the possibility of 
adding perfect differentials to the elemental force, expressions that integrate to zero around 
closed circuits or circuits extending to infinity. Consider the two closed circuits C and C'
carrying currents I and I', respectively, as shown in Fig. 1. Ampere believed that the force
increment dF between differential directed current segments I ds and I' ds' was a central
force, that is, directed along the line between the segments. He wrote his elemental force law
in compact form (Ampere, 1827, p. 302),



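$$ d\mathbf{F} = 4 k I I' \, \frac{\hat{\mathbf{r}}}{\sqrt{r}} \, \frac{\partial^2 \sqrt{r}}{\partial s \, \partial s'} \, ds \, ds' = k I I' \, \frac{\hat{\mathbf{r}}}{r^2} \left[ 2r \frac{\partial^2 r}{\partial s \, \partial s'} - \frac{\partial r}{\partial s} \frac{\partial r}{\partial s'} \right] ds \, ds' , \qquad (2) $$

where the constant $k = 1/c^2$ in Gaussian units. The distance $r$ is the magnitude of
$\mathbf{r} = \mathbf{x} - \mathbf{x}'$, where $\mathbf{x}$ and $\mathbf{x}'$ are the coordinates of $d\mathbf{s} = \mathbf{n}\, ds$ and $d\mathbf{s}' = \mathbf{n}'\, ds'$. In what follows
we also use the unit vector $\hat{\mathbf{r}} = \mathbf{r}/r$. In vector notation and Gaussian units, Ampere's force
reads

$$ d\mathbf{F} = \frac{I I'}{c^2} \, \frac{\hat{\mathbf{r}}}{r^2} \left[ 3 \, (\hat{\mathbf{r}} \cdot \mathbf{n}) (\hat{\mathbf{r}} \cdot \mathbf{n}') - 2 \, \mathbf{n} \cdot \mathbf{n}' \right] ds \, ds' . \qquad (3) $$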



It is interesting to note that Ampere has the equivalent of this expression at the bottom of p. 
253 in (Ampere, 1827) in terms of the cosines defined by the scalar products. He preferred, 
however, to suppress the cosines and express his result in terms of the derivatives of r with 
respect to ds and ds', as in (2). 

The first observation to make is that the abstracted increment of force dF has no
physical meaning because it violates the continuity of charge and current. Currents cannot
suddenly materialize, flow along the elements ds and ds' and then disappear again. The
expression is only an intermediate mathematical construct, perhaps useful, perhaps not, in 
finding actual forces between real circuits. The second observation is that the form widely 
used at present (see Eq.(5.8), p. 177 of Jackson (1998) for the integrated expression) , 



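$$ d\mathbf{F} = \frac{I I'}{c^2 r^2} \, \mathbf{n} \times (\mathbf{n}' \times \hat{\mathbf{r}}) \, ds \, ds' = \frac{I I'}{c^2 r^2} \left[ \mathbf{n}' (\hat{\mathbf{r}} \cdot \mathbf{n}) - \hat{\mathbf{r}} \, (\mathbf{n} \cdot \mathbf{n}') \right] ds \, ds' , \qquad (4) $$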



was first written down independently in 1845 by Neumann (Eq.(2), p. 64 of Neumann, 1847) 
and Grassmann (1845)*. Although not how these authors arrived at it, one way to 



* A cogent discussion of Ampere's work, his running dispute with Biot, and also 
Grassmann's criticisms and alternative expression for the force, is given in the little book by 
Tricker (1965). Tricker also gives translations of portions of the papers by Oersted, Ampere, 
Biot and Savart, and Grassmann. 



understand its form is to recall that a charge q' in nonrelativistic motion with velocity v'
(think of a quasi-free electron moving through the stationary positive ions in a conductor)
generates a magnetic field ($\mathbf{B}' \propto q' \mathbf{v}' \times \hat{\mathbf{r}}/r^2$). Through the Lorentz force law
$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}/c)$, this field produces a force on a similar charge q moving with velocity v in a second
conductor. Now replace $q\mathbf{v}$ and $q'\mathbf{v}'$ with $I \mathbf{n}\, ds$ and $I' \mathbf{n}'\, ds'$. With its non-central contribution,
(4) does not agree with Ampere's expression, but the differences vanish for the total force between two
closed circuits, the only meaningful thing. In fact, the first term in (4) contains a perfect
differential, $\mathbf{n} \cdot \hat{\mathbf{r}} \, ds / r^2 = -d\mathbf{s} \cdot \nabla(1/r)$, which gives a zero contribution when integrated over the closed path C in Fig. 1.
If we ignore this part of dF, the residue appears as a central force (!) between elements,
although not the same as Ampere's central force.
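
The vanishing of the perfect-differential term can be written out explicitly. The following is a sketch under the assumption that (4) is taken in the form $d\mathbf{F} = (II'/c^2 r^2)\,[\mathbf{n}'(\hat{\mathbf{r}} \cdot \mathbf{n}) - \hat{\mathbf{r}}\,(\mathbf{n} \cdot \mathbf{n}')]\, ds\, ds'$ with $\mathbf{r} = \mathbf{x} - \mathbf{x}'$; the integration is over circuit C with the source element at $\mathbf{x}'$ held fixed.

```latex
% Sketch: the first term of (4) integrates to zero around the closed circuit C.
% \nabla acts on x, and \nabla(1/r) = -\hat{r}/r^2 .
\begin{align*}
\oint_C \frac{\mathbf{n}' \, (\hat{\mathbf{r}} \cdot \mathbf{n})}{r^2} \, ds
  &= -\,\mathbf{n}' \oint_C d\mathbf{s} \cdot \nabla\Big(\frac{1}{r}\Big)
   = -\,\mathbf{n}' \oint_C d\Big(\frac{1}{r}\Big) = 0 , \\
\mathbf{F} &= -\frac{I I'}{c^2} \oint_C \oint_{C'}
   \frac{\hat{\mathbf{r}} \, (\mathbf{n} \cdot \mathbf{n}')}{r^2} \, ds \, ds' .
\end{align*}
```

Only the central term survives in the total force on C, which is why elemental laws that differ by such terms cannot be distinguished by experiments on closed circuits.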

Faraday's discovery in 1831 of electromagnetic induction - relative motion of a magnet 
near a closed circuit induces a momentary flow of current - exposed the direct link between 
electric and magnetic fields (Faraday, 1839). The experimental basis of quasi-static 
electromagnetism was now established, although the differential forms of the basic laws were 
incomplete and Maxwell's completion of the description with the displacement current was 
still 34 years in the future. Research tended to continue on the behaviour of current-carrying 
circuits interacting with magnets or other circuits. While workers spoke of induced currents, 
use of Ohm's law made it clear that they had in mind induced electric fields along the circuit 
elements. 

Franz E. Neumann in 1845 and 1847 analyzed the process of electromagnetic 
induction in one circuit from the relative motion of nearby magnets and other circuits 
(Neumann, 1847, 1849). He is credited by later writers as having invented the vector 
potential, but his formulas are always for the induced current or its integral and so are 
products of quantities among which one can sense the vector potential or its time derivative 
lurking, without explicit display. In the latter parts of his papers, he adopts a different tack. 
As mentioned above, he expresses the elemental force between current elements in what 
amounts to (4). He then omits the perfect differential to arrive at an expression for the 
elemental force dF (the second term in (4)) that is the negative gradient with respect to $\mathbf{r}$ of
a magnetic potential energy dP. From (4) we see that dP and its double integral P (over the
circuits in Fig. 1) are

$$ dP = -\frac{I I'}{c^2} \, \frac{\mathbf{n} \cdot \mathbf{n}'}{r} \, ds \, ds' , \qquad P = -\frac{I I'}{c^2} \oint_C \oint_{C'} \frac{\mathbf{n} \cdot \mathbf{n}'}{r} \, ds \, ds' . \qquad (5) $$

The double integral in (5) is the definition of the mutual inductance of the circuits C and C'.
The force on circuit C is now the negative gradient of P with respect to a suitable coordinate
defining the position of C, with both circuits kept fixed in orientation. Neumann's P is the
negative of the magnetic interaction energy W, defined nowadays as

$$ W = \frac{I}{c} \oint_C \mathbf{n} \cdot \mathbf{A}'(\mathbf{x}) \, ds , \quad \text{with} \quad \mathbf{A}'(\mathbf{x}) = \frac{I'}{c} \oint_{C'} \frac{\mathbf{n}'}{r} \, ds' , \qquad (6) $$

where $\mathbf{A}'$ is the vector potential of the current I' flowing in circuit C'. For a general current
density $\mathbf{J}(\mathbf{x}', t)$, this form of the vector potential is

$$ \mathbf{A}_N(\mathbf{x}, t) = \frac{1}{c} \int d^3x' \, \frac{1}{r} \, \mathbf{J}(\mathbf{x}', t) . \qquad (7) $$

We have attached a subscript N to A here to associate it with Neumann's work, as did
subsequent investigators, even though he never explicitly displayed (6) or (7). 

Independently and at roughly the same time as Neumann, in 1846 Wilhelm Weber 
presented a theory of electromagnetic induction, considering both relative motion and time- 
varying currents as sources of the electromotive force in the secondary circuit. To this end, he 
introduced a central force law between two charges e and e' in motion, consistent with 
Ampere's law for current-carrying circuits (Weber, 1878, 1848). Weber adopted the 
hypothesis that current flow in a wire consists of equal numbers of charges of both signs 
moving at the same speed, but in opposite directions, rather than the general view at the time 
that currents were caused by the flow of two electrical fluids. He thus needed a basic force 
law between charges to calculate forces between circuits. Parenthetically we note that this 
hypothesis, together with the convention that the current flow was measured in terms of the 
flow of only one sign of charge, led to the appearance of factors of two and four in peculiar 
places, causing confusion to the unwary. We write everything with modern conventions. 
Weber's central force law, admittedly ad hoc and incorrect as a force between charges in 
motion, is (Weber, 1878, p. 229) 



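$$ \mathbf{F} = \frac{e e'}{r^2} \, \hat{\mathbf{r}} + \frac{e e'}{c^2} \, \hat{\mathbf{r}} \left[ \frac{1}{r} \frac{d^2 r}{dt^2} - \frac{1}{2 r^2} \left( \frac{dr}{dt} \right)^2 \right] . \qquad (8) $$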



The first term is just Coulomb's law. The ingredients of the second part can be expressed 
explicitly as 

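$$ \frac{dr}{dt} = \hat{\mathbf{r}} \cdot (\mathbf{v} - \mathbf{v}') ; \qquad \frac{d^2 r}{dt^2} = \hat{\mathbf{r}} \cdot (\mathbf{a} - \mathbf{a}') + \frac{1}{r} \left[ (\mathbf{v} - \mathbf{v}')^2 - \big( \hat{\mathbf{r}} \cdot (\mathbf{v} - \mathbf{v}') \big)^2 \right] , \qquad (9) $$

where $\mathbf{v}$ and $\mathbf{a}$ ($\mathbf{v}'$ and $\mathbf{a}'$) are the velocity and acceleration of the charge e (e'). If we add up
the forces between the charges ±e with velocities ±v in the one current and those ±e' with
velocities ±v' in the other, and identify $2e\mathbf{v} = I \mathbf{n}\, ds$ and $2e'\mathbf{v}' = I' \mathbf{n}'\, ds'$, we obtain Ampere's
expression (3).

If instead of the force between current elements, we consider the force on a charge e at rest at
$\mathbf{x}$ (the position of $\mathbf{n}\, ds$) due to the current element $I' \mathbf{n}'\, ds'$, we find from Weber's force law,

$$ d\mathbf{F} = -\frac{e}{c^2} \, \frac{\hat{\mathbf{r}} \, (\hat{\mathbf{r}} \cdot \mathbf{n}')}{r} \, \frac{dI'}{dt} \, ds' , \qquad (10) $$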

where dl'/dt arises from the presence of the acceleration a'. Weber's analysis was more 
complicated than that just described because he treated relative motion and time variation of 
the inducing current simultaneously, but for circuits with no relative motion Weber (Weber, 
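1848, p. 239) writes the induced electromotive force in the circuit, $dE = d\mathbf{F} \cdot \mathbf{n}\, ds$, in this form.
Weber wrote only the component of the induced force or emf along the element $\mathbf{n}\, ds$, but if we
identify the induced electric field as $\mathbf{E} = -(1/c)\, \partial (d\mathbf{A})/\partial t$, with $d\mathbf{A}$ the elemental vector
potential, we find from (10) $d\mathbf{A}$ and its integral $\mathbf{A}$ over the inducing circuit C' to be

$$ d\mathbf{A} = \frac{I'}{c} \, \frac{\hat{\mathbf{r}} \, (\hat{\mathbf{r}} \cdot \mathbf{n}')}{r} \, ds' ; \qquad \mathbf{A} = \frac{I'}{c} \oint_{C'} \frac{\hat{\mathbf{r}} \, (\hat{\mathbf{r}} \cdot \mathbf{n}')}{r} \, ds' . \qquad (11) $$

The generalization of this form of the vector potential for a current density $\mathbf{J}(\mathbf{x}', t)$ is

$$ \mathbf{A}_W(\mathbf{x}, t) = \frac{1}{c} \int d^3x' \, \frac{1}{r} \, \hat{\mathbf{r}} \, \big( \hat{\mathbf{r}} \cdot \mathbf{J}(\mathbf{x}', t) \big) . \qquad (12) $$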



As with Neumann, we attach a subscript W for Weber to this form of the vector potential 
even though he did not write (11) or (12) explicitly. 

B. Vector potentials - Kirchhoff and Helmholtz

Gustav Kirchhoff was the first to write explicitly (in component form) the vector 
potential (12); he also wrote the components of the induced current density as the 
conductivity times the negative sum of the gradient of the scalar potential and the time 
derivative of the vector potential (Kirchhoff, 1857, p. 530). He attributed the second term in 
the sum to Weber; the expression (12) became known as the Kirchhoff-Weber form of the 
vector potential. Kirchhoff applied his formalism to analyze the telegraph and calculate 
inductances. 

We note in passing that Kirchhoff showed (contrary to what is implied by Rosenfeld, 
1957) that the Weber form of $\mathbf{A}$ and the associated scalar potential $\Phi$ satisfy the relation (in
modern notation), $\nabla \cdot \mathbf{A} = \partial\Phi/c\,\partial t$, the first published relation between potentials in what we
now know as a particular gauge (Kirchhoff, 1857, p. 532-533).

In an impressive, if repetitive, series of papers, Hermann von Helmholtz (1870, 1872, 
1873, 1874) criticized and clarified the earlier work of Neumann, Weber, and others. He 
criticized Weber's force equation for leading to unphysical behavior of charged bodies in some 
circumstances, but recognized that Weber's form of the magnetic energy had validity. 
Helmholtz compared the Neumann and Weber forms of the magnetic energy between current 

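elements, $dW = p \, I I' \, ds \, ds'/c^2 r$, with $p(\text{Neumann}) = \mathbf{n} \cdot \mathbf{n}'$ and $p(\text{Weber}) = (\hat{\mathbf{r}} \cdot \mathbf{n})(\hat{\mathbf{r}} \cdot \mathbf{n}')$, and noted
that they differ by a multiple of the perfect differential $ds \, ds' \, (\partial^2 r/\partial s \, \partial s') =
ds \, ds' \, [(\hat{\mathbf{r}} \cdot \mathbf{n})(\hat{\mathbf{r}} \cdot \mathbf{n}') - \mathbf{n} \cdot \mathbf{n}']/r$. Thus either form leads to the same potential energy and force for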
closed circuits. Helmholtz then generalized the expressions of Weber and Neumann for the 
magnetic energy between current elements by writing a linear combination (Helmholtz, 1870, 
equation (1.), p. 76, but in modern notation),

$$ dW = \frac{I I'}{2 c^2 r} \left[ (1 + \alpha) \, \mathbf{n} \cdot \mathbf{n}' + (1 - \alpha) \, (\hat{\mathbf{r}} \cdot \mathbf{n})(\hat{\mathbf{r}} \cdot \mathbf{n}') \right] ds \, ds' . \qquad (13) $$

Obviously, this linear combination differs from either Weber's or Neumann's expressions by a
multiple of the above perfect differential, and so is consistent with Ampere's observations.
The equivalent linear combination of the vector potentials (7) and (12) is
(ibid., equation (1a.), p. 76, in compressed notation)

$$ \mathbf{A}_\alpha = \tfrac{1}{2} (1 + \alpha) \, \mathbf{A}_N + \tfrac{1}{2} (1 - \alpha) \, \mathbf{A}_W . \qquad (14) $$

$\alpha = 1$ gives the Neumann form; $\alpha = -1$ gives Weber. Helmholtz's generalization exhibits a
one-parameter class of potentials that is equivalent to a family of vector potentials of different
gauges in Maxwell's electrodynamics. In fact, in equation (1d.) on p. 77, he writes the
connection between his generalization and the Neumann form (7) as (in modern notation),

$$ \mathbf{A}_\alpha = \mathbf{A}_N + \frac{(1 - \alpha)}{2} \, \nabla\Psi , \quad \text{where} \quad \Psi = -\frac{1}{c} \int \hat{\mathbf{r}} \cdot \mathbf{J}(\mathbf{x}', t) \, d^3x' . \qquad (15) $$

Helmholtz goes on to show that $\Psi$ satisfies $\nabla^2 \Psi = 2 \, \partial\Phi/c\,\partial t$, where $\Phi$ is the instantaneous
electrostatic potential, and that $\Phi(\mathbf{x}, t)$ and his vector potential $\mathbf{A}_\alpha(\mathbf{x}, t)$ are related by (ibid.,
equation (3a.), p. 80, in modern notation)

$$ \nabla \cdot \mathbf{A}_\alpha = -\alpha \, \frac{\partial \Phi}{c \, \partial t} . \qquad (16) $$



This relation contains the connection found in 1857 by Kirchhoff (for a = -1) and formally the 
condition found in 1867 by Lorenz -see below - but Helmholtz's relation connects only the 
quasi-static potentials, while Lorenz's relation holds for the fully retarded potentials. 
Helmholtz is close to establishing the gauge invariance of electromagnetism, but treats only a 
restricted class of gauges and lacks the transformation of the scalar as well as the vector 
potential. 
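
The step from (15) to (16) and its special cases can be laid out explicitly. The block below is a sketch, not Helmholtz's own argument: it assumes the Neumann form $\mathbf{A}_N = (1/c)\int d^3x'\, \mathbf{J}(\mathbf{x}',t)/r$, a localized current obeying the continuity equation, the instantaneous potential $\Phi = \int d^3x'\, \rho(\mathbf{x}',t)/r$, and the relation $\nabla^2\Psi = 2\,\partial\Phi/c\,\partial t$ quoted in the text.

```latex
% Sketch: divergence of the Helmholtz family A_alpha = A_N + (1-alpha)/2 * grad Psi.
% Assumes continuity, \nabla' \cdot J + \partial\rho/\partial t = 0, and a localized current.
\begin{align*}
\nabla \cdot \mathbf{A}_N
  &= \frac{1}{c} \int d^3x' \, \frac{\nabla' \cdot \mathbf{J}(\mathbf{x}', t)}{r}
   = -\frac{1}{c} \frac{\partial \Phi}{\partial t} , \\
\nabla \cdot \mathbf{A}_\alpha
  &= \nabla \cdot \mathbf{A}_N + \frac{1 - \alpha}{2} \, \nabla^2 \Psi
   = -\frac{1}{c} \frac{\partial \Phi}{\partial t}
     + (1 - \alpha) \, \frac{1}{c} \frac{\partial \Phi}{\partial t}
   = -\frac{\alpha}{c} \frac{\partial \Phi}{\partial t} . \\
% Special cases:
\alpha = 1 &: \quad \nabla \cdot \mathbf{A}_N = -\frac{1}{c} \frac{\partial \Phi}{\partial t}
   \quad \text{(Neumann; formally the Lorenz relation)} , \\
\alpha = -1 &: \quad \nabla \cdot \mathbf{A}_W = +\frac{1}{c} \frac{\partial \Phi}{\partial t}
   \quad \text{(Kirchhoff--Weber)} , \\
\alpha = 0 &: \quad \nabla \cdot \mathbf{A}_0 = 0
   \quad \text{(vanishing divergence, Maxwell's choice)} .
\end{align*}
```

The parameter $\alpha$ thus only changes the longitudinal part of the vector potential, which is the sense in which the family is a restricted class of gauges.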

Helmholtz remarks rather imprecisely that the choice of $\alpha = 0$ leads to Maxwell's
theory. The resulting vector potential,

$$ \mathbf{A}_M(\mathbf{x}, t) = \frac{1}{2c} \int \frac{1}{r} \left[ \mathbf{J}(\mathbf{x}', t) + \hat{\mathbf{r}} \, \big( \hat{\mathbf{r}} \cdot \mathbf{J}(\mathbf{x}', t) \big) \right] d^3x' , \qquad (17) $$

can be identified with Maxwell only because, as (16) shows, it is the quasi-static vector
potential found from the transverse current for $\nabla \cdot \mathbf{A} = 0$, Maxwell's preferred choice for $\mathbf{A}$.
Maxwell never wrote down (17). It is relevant for finding an approximate Lagrangian for the
interaction of charged particles, correct to order $1/c^2$ - see Section II.D.

We see in the early history the attempts to extend Ampere's conclusions on the 
forces between current-carrying circuits to a comprehensive description of the interaction of 
currents largely within the framework of potential energy, in analogy with electrostatics. 
Competing descriptions stemmed from the arbitrariness associated with the postulated 
elemental interactions between current elements, an arbitrariness that vanished upon 
integration over closed circuits. These differences led to different but equivalent forms for the 
vector potential. The focus was on steady-state current flow or quasi-static behavior. 
Meanwhile, others were addressing the propagation of light and its possible connection with 
electricity, electric currents, and magnetism. That electricity was due to discrete charges and 
electric currents to discrete charges in motion was a minority view, with Weber a notable 
advocate. Gradually, those ideas gained credence and charged particle dynamics came under 
study. Our story now turns to these developments and how the concept of different gauges
was elaborated, and by whose hands. 

C. Electrodynamics by Maxwell, Lorenz, and Hertz

The vector potential played an important role in Maxwell's emerging formulation of 
electromagnetic theory (Bork, 1967; see also Everitt, 1975). He developed an analytic 
description of Faraday's intuitive idea that a conducting circuit in a magnetic field was in an 
"electro-tonic state," ready to respond with current flow if the magnetic flux linking it changed 
in time (Maxwell, 1856). He introduced a vector, "electro-tonic intensity," with vanishing 
divergence, whose curl is the magnetic field B or equivalently whose line integral around the
circuit is related by Stokes's theorem to the magnetic flux through the loop. Including both
time-varying magnetic fields and motion of the circuit through an inhomogeneous field,
Maxwell expressed the electromotive force (in our language) as $c\mathbf{E} = -d\mathbf{A}/dt = -\partial\mathbf{A}/\partial t - (\mathbf{v} \cdot \nabla) \mathbf{A}$.
Maxwell contrasts his mathematical treatment (which he does "not think [that it]
contains even the shadow of a true physical theory") with that of Weber, which he calls "a 
professedly physical theory of electro-dynamics, which is so elegant, so mathematical, and so 
entirely different from anything in this paper,...." (Maxwell, 1856, Sc. P., Vol. 1, p.207-208). 

Interestingly, in the introduction to (Maxwell, 1865), while praising Weber and Carl 
Neumann, he distances himself from them in avoiding charged particles as sources, velocity- 
dependent interactions, and action-at-a-distance, preferring the mechanism of excited bodies 
and the propagation of effects through the ether. Specifically he states (Maxwell, 1865, Sc. P. 
Vol. 1, p. 528): 

"We therefore have some reason to believe, from the phenomena of light and heat, 
that there is an aethereal medium filling space and permeating bodies, capable of being 
set in motion and of transmitting that motion from one point to another, and of 
transmitting that motion to gross matter so as to heat it and affect it in various ways." 
In this paper he again asserts his approach to the vector potential, now called 
"electromagnetic momentum," with its line integral around a circuit called the total 
electromagnetic momentum of the circuit. The central role of the vector potential in Maxwell's 
thinking is evidenced in the table (op. cit., p. 561) where in his list of the 20 variable 
quantities in his equations (F, G, H), the components of "Electromagnetic Momentum," top 
the list. A similar list in his treatise (Maxwell, 1873, Vol. 2, Art. 618, p. 236) has the 
electromagnetic momentum second, after the coordinates of a point. In (Maxwell, 1865, Sc. 
P., Vol. 1, p. 564), he explains the use of the term "electromagnetic momentum" as a result of 
the analogy of the mechanical $F = dp/dt$ and the electromagnetic $c\mathbf{E} = -d\mathbf{A}/dt$, but cautions 
that it is "to be considered as illustrative, not as explanatory."* A curiosity is Maxwell's 
use in his treatise (Maxwell, 1873) of at least three different expressions for the same 
quantity - vector potential (Sects. 405, 590, 617), electrokinetic momentum (Sects. 579, 590), 
electromagnetic momentum (Sects. 604, 618). 

* The aptness of the term electromagnetic momentum goes beyond Maxwell's analogy; in the 
Hamiltonian dynamics of a charged particle the canonical momentum is $\mathbf{P} = \mathbf{p} + e\mathbf{A}/c$. 

Helmholtz's identification of (17) with Maxwell is because Maxwell preferred $\nabla\cdot\mathbf{A} = 0$ 
when using any vector potential (Maxwell, 1865, Sect. 98, p. 581; Maxwell, 1873, 1st ed., 
Sects. 616, 617, p. 235-236; 3rd ed., p. 256). In (Maxwell, 1873) he writes the vector 
potential $\mathbf{A}'$ in the Neumann form (7), but with the "total current," conduction $\mathbf{J}$ plus 
displacement $\partial\mathbf{D}/c\,\partial t$, instead of $\mathbf{J}$ alone. [We transcribe his notation into present day notation 
where appropriate.] He then writes what is now called the gauge transformation equation 
$\mathbf{A} = \mathbf{A}' - \nabla\chi$ (Maxwell, 1873, equation (7), Sect. 616, p. 235, 1st ed., p. 256, 3rd ed.) and 
observes: 

"The quantity $\chi$ disappears from the equations (A) [$\mathbf{B} = \nabla\times\mathbf{A}$] and it is not related 
to any physical phenomenon." 
He goes on to say that he will set $\chi = 0$, remove the prime from $\mathbf{A}'$ and have it as the true 
value of the vector potential. The virtue to Maxwell of his $\mathbf{A}$ is that 

"it is the vector-potential of the electric current, standing in the same relation to the 
electric current that the scalar potential stands to the matter of which it is the 
potential." 

Maxwell's statement $\mathbf{A} = \mathbf{A}' - \nabla\chi$ and the invariance of the fields under this (gauge) 
transformation is one of the earliest explicit statements, more general than Helmholtz's, but 
he misses stating the accompanying transformation of the scalar potential because of his use 
of the "total current" as the source of the vector potential. In the quasi-static limit, the 
elimination of the displacement current in vacuum in favor of the potentials and their sources 
leads to (17), the form Helmholtz identified with Maxwell. 
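
A short worked check, in modern Gaussian notation, shows why the scalar potential has to change along with the vector potential. The primed scalar potential $\Phi'$ and the field expressions below are the standard modern ones and are assumed here for illustration; they are not Maxwell's own notation.

```latex
% Sketch: effect of A = A' - grad(chi) on the fields.
% Assumed modern definitions: B = curl A, E = -grad(Phi) - (1/c) dA/dt.
\begin{align*}
\nabla\times\mathbf{A}
  &= \nabla\times\mathbf{A}' - \nabla\times(\nabla\chi)
   = \nabla\times\mathbf{A}' , \\
\mathbf{E}
  &= -\nabla\Phi - \frac{1}{c}\frac{\partial}{\partial t}
       \bigl(\mathbf{A}' - \nabla\chi\bigr)
   = -\nabla\Bigl(\Phi - \frac{1}{c}\frac{\partial\chi}{\partial t}\Bigr)
     - \frac{1}{c}\frac{\partial\mathbf{A}'}{\partial t} .
\end{align*}
```

The magnetic field is unchanged for any $\chi$, while the electric field keeps its form only if the scalar potential paired with $\mathbf{A}'$ is $\Phi' = \Phi - (1/c)\,\partial\chi/\partial t$.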

The Danish physicist Ludvig Valentin Lorenz is perhaps best known for his pairing 
with the more famous Dutch physicist Hendrik Antoon Lorentz in the Lorenz-Lorentz relation 
between index of refraction and density. In fact he was a pioneer in the theory of light and in 
electrodynamics, contemporaneous with Maxwell. In 1862 he developed a mathematical 
theory of light, using the basic known facts (transversality of vibrations, Fresnel's laws), but 
avoiding the (unnecessary, to him) physical modeling of a mechanistic ether* with bizarre 
properties in favor of a purely phenomenological model (Lorenz, 1863). Indeed, in a Danish 
publication (Lorenz, 1867a) he took a very modern sounding position on the luminiferous 
ether, saying, 

"The assumption of an ether would be unreasonable because it is a new non-substantial 
medium which has been thought of only because light was conceived in the 
same manner as sound and hence had to be a medium of exceedingly large elasticity 
and small density to explain the large velocity of light.... It is most unscientific to 
invent a new substance when its existence is not revealed in a much more definite 
way." (translation taken from Kragh, 1991, p. 4690). 

* Notable in this regard, but somewhat peripheral to our history of gauge invariance, was 
James MacCullagh's early development of a phenomenological theory of light as disturbances 
propagating in a novel form of the elastic ether, with the potential energy depending not on 
compression and distortion but only on local rotation of the medium in order to make the light 
vibrations purely transverse (MacCullagh, 1839; Whittaker, 1951, p. 141-4; Buchwald, 1985, 
Appendix 2). MacCullagh's equations correspond (when interpreted properly) to Maxwell's 
equations for free fields in anisotropic media. We thank John P. Ralston for making available 
his unpublished manuscript on MacCullagh's work. 

That same year, two years after Maxwell (1865) but evidently independently, he 
published a paper entitled "On the Identity of the Vibrations of Light with Electric Currents," 
(Lorenz, 1867b). On p. 287, addressing the issue of the disparities between the nature of 
electricity (two fluids), light (vibrations of the ether), and heat (motion of molecules) half a 
century after Oersted's discoveries, he laments the absence of a unity of forces. He 
continues, 

"Hence it would probably be best to admit that in the present state of science we can 
form no conception of the physical reason of forces and of their working in the interior 
of bodies; and therefore (at present, at all events) we must choose another way, free 
from all physical hypotheses, in order, if possible, to develope (sic) theory step by 
step in such a manner that the further progress of a future time will not nullify the 
results obtained." 

Avoiding the distasteful ether, Lorenz follows Kirchhoff in attributing a conductivity to 
material media, and also a negligibly small but not zero conductivity for "empty" space. He 
thus deals with current densities rather than electric fields, which he defines according to 
Ohm's law ($\mathbf{J} = \sigma\mathbf{E}$); many of his equations are the customary ones when divided by the 
conductivity $\sigma$. After stating Kirchhoff's version of the static potentials, in which the vector 
potential is Weber's form (12), he observes that retardation is necessary to account for the 
finite speed of propagation of light and, he supposes, electromagnetic disturbances in general. 
He generalizes the static scalar and vector potentials to the familiar expressions, often 
attributed to Lorentz, by introduction of $\Omega$ (Lorenz, 1867b, p. 289, [Phil. Mag.]) as his scalar 
potential and $\alpha$, $\beta$, $\gamma$ (ibid., p. 291) as the components of his vector potential. In modern 
notation these are 

$$\Phi(\mathbf{x},t) = \int \frac{\rho(\mathbf{x}', t - r/c)}{r}\, d^3x' ; \qquad \mathbf{A}(\mathbf{x},t) = \frac{1}{c}\int \frac{\mathbf{J}(\mathbf{x}', t - r/c)}{r}\, d^3x' , \tag{18}$$

the latter being the retarded form of the Neumann version (7). After showing that all known 
facts of electricity and magnetism (at that time all quasi-static) are consistent with the 
retarded potentials as much as with the static forms, he proceeds to derive equations for the 
fields that are the Maxwell equations we know, with an Ohm's law contribution for the 
assumed conducting medium. He points out that these equations are equivalent to those of 
his 1862 paper on light and proceeds to discuss light propagation and attenuation in metals, in 
dielectrics, in empty space, and the absence of free charge within conductors. He also works 
backward from the differential equations to obtain the retarded solutions for the potentials 
and the electric field in terms of the potentials in order to establish completely the 
equivalence of his theories of light and electromagnetism. 

In the course of deriving his "Maxwell equations," Lorenz establishes that his 
retarded potentials are solutions of the wave equation and also must satisfy the condition, 
$d\Omega/dt = -2(d\alpha/dx + d\beta/dy + d\gamma/dz)$ (ibid., p. 294) or in modern notation and units, 

$$\nabla\cdot\mathbf{A} + \frac{1}{c}\frac{\partial\Phi}{\partial t} = 0 . \tag{19}$$

This equation, now almost universally called the "Lorentz condition," is seen to originate 
with Lorenz more than 25 years before Lorentz. In discussing the quasi-static limit, Lorenz 
remarks (p. 292) that the retarded potentials (in modern, corrected terms, the "Lorenz" 
gauge potentials) give the same fields as the instantaneous scalar potential and a vector 
potential that is "a mean between Weber's and Neumann's theories," namely, (17), 
appropriate for Maxwell's choice of $\nabla\cdot\mathbf{A} = 0$. Without explicit reference, Lorenz was 
apparently aware of and made use of what we call gauge transformations. 

Lorenz's paper makes no reference to Maxwell. Indeed, he only cites himself, but by 
1868 Maxwell had read Lorenz's paper and in his Treatise, at the end of the chapter giving 
his electromagnetic theory of light, he mentions Lorenz's work as covering essentially the 
same ground (Maxwell, 1873, 1st ed., Note after Sect. 805, p. 398; 3rd ed., p. 449-450). 
Although Lorenz made a number of contributions to optics and electromagnetism during his 
career (Kragh, 1991, 1992), his pioneering papers were soon forgotten. A major contributing 
factor was surely Maxwell's objection to the retarded potentials of Riemann (1867) and 
Lorenz (1867): 

"From the assumptions of both these papers we may draw the conclusions, first, that 
action and reaction are not always equal and opposite, and second, that apparatus 
may be constructed to generate any amount of work from its resources." (Maxwell, 
1868, Sc. P., Vol. 2, p. 137). 
Given the sanctity of Newton's third law and the conservation of energy, and Maxwell's 
stature, such criticism would be devastating. It is ironic that the person who almost invented 
electromagnetic momentum and who showed that all electromagnetic effects propagate with 
the speed of light did not recognize that the momentum of the electromagnetic fields needed 
to be taken into account in Newton's third law. Lorenz died in 1891, inadequately recognized 
then or later. In fact, by 1900 his name had disappeared from the mainstream literature on 
electromagnetism. 

An interesting footnote on the Lorenz condition (19) is that in 1888, twenty years 
later, FitzGerald, trying to incorporate a finite speed of propagation into his mechanical 
"wheel and band" model of the ether, was bothered by the instantaneous character of 
Maxwell's scalar potential. His model would accommodate no such instantaneous behavior. 
Realizing that it was a consequence of $\nabla\cdot\mathbf{A} = 0$, he proposed (19) and found the standard 
wave equation of propagation for both $\Phi$ and $\mathbf{A}$ (Hunt, 1991, p. 115-118). 
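
The step FitzGerald took can be sketched in modern Gaussian notation. The field definitions and the two inhomogeneous Maxwell equations used below are the standard textbook forms, assumed here for illustration; they do not appear in this form in the sources cited.

```latex
% Assumed: E = -grad(Phi) - (1/c) dA/dt,  B = curl A,
%          div E = 4 pi rho,  curl B - (1/c) dE/dt = (4 pi / c) J.
% Substituting the potentials into the two source equations gives
\begin{align*}
\nabla^2\Phi + \frac{1}{c}\frac{\partial}{\partial t}\,\nabla\cdot\mathbf{A}
  &= -4\pi\rho , \\
\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2}
  - \nabla\Bigl(\nabla\cdot\mathbf{A}
      + \frac{1}{c}\frac{\partial\Phi}{\partial t}\Bigr)
  &= -\frac{4\pi}{c}\,\mathbf{J} .
\end{align*}
% With div A = 0 the first line is Poisson's equation (instantaneous Phi).
% With condition (19), div A = -(1/c) dPhi/dt, both potentials obey
\begin{align*}
\nabla^2\Phi - \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2}
  &= -4\pi\rho , &
\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2}
  &= -\frac{4\pi}{c}\,\mathbf{J} .
\end{align*}
```

The instantaneous behavior of the scalar potential is thus tied to the choice $\nabla\cdot\mathbf{A} = 0$ and not to the fields themselves.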

The mistaken attribution of (19) to Lorentz was pointed out by O'Rahilly (1938), Van 
Bladel (1991), and others. That Lorenz, not Lorentz, was the father of the retarded potentials 
(18) was first pointed out by Whittaker (1951, p. 268) but he mistakenly states (ibid., p. 394) 
that Levi-Civita was the first to show (in 1897) that potentials defined by these integrals 
satisfy (19). Levi-Civita in fact does just what Lorenz did in 1867. Lorentz's own use of the 
Lorenz condition is discussed below. 

Heinrich Hertz is most famous for his experiments in the 1880s demonstrating the free 
propagation of electromagnetic waves (Hertz, 1892), but he is equally important for his 
theoretical viewpoint. In 1884, beginning with the quasi-static, instantaneous electric and 
magnetic vector potentials of Helmholtz et al., he developed an iteration scheme that led to 
wave equations for the potentials and to the Maxwell equations in free space for the fields 
(Hertz, 1896, Electric Waves, p. 273-290). His iterative approach showed one path from the 
action-at-a-distance potentials to the dynamical Maxwell equations for the fields. Hertz 
(ibid., p. 286) states that both 

"Riemann in 1858 and Lorenz in 1867, with a view to associating optical and electrical 
phenomena with one another, postulated the same or quite similar laws for the 
propagation of the potentials. These investigators recognized that these laws involve 
the addition of new terms to the forces which actually occur in electromagnetics; and 
they justify this by pointing out that these new terms are too small to be 
experimentally observable. But we see that the addition of these terms is far from 
needing any apology. Indeed their absence would necessarily involve contradiction of 
principles which are quite generally accepted." 
It seems that Hertz did not fully appreciate that, while Lorenz's path from potentials to field 
equations was different in detail from his, Lorenz accomplished the same result 17 years 
earlier. Lorenz was not apologizing, but justifying his adoption of the retarded potentials as 
the necessary generalization, still in agreement with the known facts of electricity and 
magnetism. They were his starting point for obtaining his form of the Maxwell equations. 

Six years later, Hertz (Hertz, 1892, p. 193-268) addressed electrodynamics for bodies 
at rest and in motion. He discussed various applications, with the fields always to the fore 
and the scalar and vector potentials secondary. In this endeavor he made common cause with 
Heaviside, to whom he gives prior credit (Hunt, 1991, p. 122-128). Both men believed the 
potentials were unnecessary and confusing. In calculations Hertz apparently avoided them at 
all costs; Heaviside used them sparingly (O'Hara and Pricha, 1987, p. 58, 62, 66-67). By using 
only the fields, Hertz avoided the issue of different forms of the potentials - his formalism was 
gauge invariant, by definition*. 



* Hertz did not avoid potentials entirely. His name is associated with the "polarization 
potentials" of radiation problems. 



We have already described Weber's force equation (8) for the interaction of charged 
particles. While it permitted Weber to deduce the correct force between closed current-carrying 
circuits, it does not even remotely agree to order $1/c^2$ with the force between two 
charges in motion. It also implies inherently unphysical behavior, as shown by Helmholtz 
(1873). Weber's work was important nevertheless in its focus on charged particles instead of 
currents and its initiation of the Kirchhoff-Weber form of the vector potential. 

A significant variation on charged particle dynamics, closer to the truth than Weber's, 
was proposed by Rudolf Julius Emanuel Clausius (1877, 1880). Struck by Helmholtz's 
demonstration of the equivalence of Weber's and Neumann's expressions for the interaction 
of charges or current elements, Clausius chose to write Lagrange's equations with an 
interaction of two charged particles e and e' that amounts to an interaction Lagrangian of the 
form,* 

$$L_{\rm int} = \frac{ee'}{r}\left[-1 + \frac{\mathbf{v}\cdot\mathbf{v}'}{c^2}\right] . \tag{20}$$

Generalized to one charge e interacting with many, treated as continuous charge and current 
densities ($\rho$, $\mathbf{J}$), this Lagrangian reads 

$$L_{\rm int} = e\left[-\Phi(\mathbf{x},t) + \frac{1}{c}\,\mathbf{v}\cdot\mathbf{A}_N(\mathbf{x},t)\right] , \tag{21}$$

where $\Phi$ is the instantaneous Coulomb potential and $\mathbf{A}_N$ is the instantaneous Neumann 
potential (7) with a time-dependent current. The interaction (20), inherent in Neumann's 
earlier work on currents, is a considerable step forward in the context of charged particle 
interactions, but its instantaneous action-at-a-distance structure means that it is not a true 
description, even to order $1/c^2$. (The force deduced from it has the correct magnetic field 
coupling to order $1/c^2$, but lacks some of the corresponding corrections to the electric field 
contribution.) 

* Prior to the end of the 19th century in mechanics and the beginning of the 20th century in 
electrodynamics the compact notation of L for T - V was rarely used in writing Lagrange's 
equations. We use the modern notation $L_{\rm int}$ as a convenient shorthand despite its absence in 
the papers cited. 

In an impressive paper, Oliver Heaviside (1889) chose $\nabla\cdot\mathbf{A} = 0$ (so that the 
instantaneous Coulomb field is exact) and constructed the appropriate vector potential, (17) 
for a point source, to give the velocity-dependent interaction correct to order $1/c^2$ (Heaviside, 
1889, p. 328, Eq. (8)). For two charges e and e' with velocities v and v', respectively, his 
results are equivalent to the interaction Lagrangian, 

$$L_{\rm int} = \frac{ee'}{r}\left[-1 + \frac{1}{2c^2}\left(\mathbf{v}\cdot\mathbf{v}' + (\hat{\mathbf{r}}\cdot\mathbf{v})(\hat{\mathbf{r}}\cdot\mathbf{v}')\right)\right] . \tag{22}$$

Heaviside also derived the magnetic part of the Lorentz force. His contributions, like 
Lorenz's, were largely ignored subsequently. Darwin (1920) derived (22) by another method 
with no reference to Heaviside and applied it to problems in the old quantum theory. See also 
Fock (1959). 

A different approach was developed by H. A. Lorentz (1892) as part of his 
comprehensive statement of what we now call the microscopic Maxwell theory, with charges 
at rest and in motion as the sole sources of electromagnetic fields. His chapter IV is devoted 
to the forces between charged particles. The development is summarized on p. 451-2 by 
statement of the microscopic Maxwell equations and the Lorentz force equation, 
$\mathbf{F} = e[\mathbf{E} + \mathbf{v}\times\mathbf{H}/c]$. In using D'Alembert's principle to derive his equations, Lorentz 
employs the vector potential, but never states explicitly its form in terms of the sources. It is 
clear, however, that he has retardation in mind, on the one hand from his exhibition of the full 
Maxwell equations to determine the fields caused by his $\rho$ and $\mathbf{J} = \rho\mathbf{v}$, and on the other by his 
words at the beginning of the chapter. He calls his reformulation (in translation from the 
French) 

"a fundamental law comparable to those of Weber and Clausius, while maintaining the 
consequences of Maxwell's principles." 
A few sentences later, he stresses that the action of one charged particle on another is 
propagated at the speed of light, a concept originated by Gauss in 1845, but largely ignored 
for nearly 50 years. 

Joseph Larmor (1900) used the principle of least action for the combined system of 
electromagnetic fields and charged particles to obtain both the Maxwell equations and the 
Lorentz force equation. Karl Schwarzschild, later renowned in astrophysics and general 
relativity, independently used the same technique to discuss the combined system of 
particles and fields (Schwarzschild, 1903). He was the first to write explicitly the familiar 
Lagrangian $L_{\rm int}$ describing the interaction of a charged particle e, with coordinate x and 
velocity v, with retarded external electromagnetic potentials, 

$$L_{\rm int} = e\left[-\Phi(\mathbf{x},t) + \frac{1}{c}\,\mathbf{v}\cdot\mathbf{A}(\mathbf{x},t)\right] , \tag{23}$$

where $\Phi$ and $\mathbf{A}$ are the potentials given by (18). 

It is curious that, to the best of the authors' knowledge, the issue of gauge invariance 
of this charged particle Lagrangian did not receive general consideration in print until 1941 in 
the text by Landau and Lifshitz (1941). See also Bergmann (1946). The proof is simple. 
Under the gauge transformation (1a,b) the Lagrangian (23) is augmented by 

$$\Delta L_{\rm int} = e\left[\frac{1}{c}\frac{\partial\chi}{\partial t} + \frac{1}{c}\,\mathbf{v}\cdot\nabla\chi\right] = \frac{e}{c}\frac{d\chi}{dt} , \tag{24}$$

a total time derivative and so makes no contribution to the equations of motion. Perhaps this 
observation is too obvious to warrant publication in other than textbooks. We note that, in 
deriving the approximate Lagrangian attributed to him, Darwin (1920) expands the retarded 
potentials for a charged particle, which involve $r = |\mathbf{x}(t) - \mathbf{x}'(t')|$, in powers of $(t'-t) = -r/c$, with 
coefficients of the primed particle's velocity, acceleration, etc., to obtain a tentative 
Lagrangian and then adds a total time derivative to obtain (22). Fock (1959) makes the 
same expansion, but then explicitly makes a gauge transformation to arrive at (22). These 
equivalent procedures exploit the arbitrariness of (24). 
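
The proof can be written out in a few lines. The sketch assumes the gauge transformation (1a,b) in the form $\mathbf{A}\to\mathbf{A}+\nabla\chi$, $\Phi\to\Phi-(1/c)\,\partial\chi/\partial t$, and treats $\mathbf{x}$ and $\mathbf{v}$ as the particle's independent Lagrangian variables.

```latex
% Change of the interaction Lagrangian (23) under the assumed transformation,
% followed by the Euler-Lagrange expression of the added term.
\begin{align*}
\Delta L_{\rm int}
  &= e\Bigl[\frac{1}{c}\frac{\partial\chi}{\partial t}
      + \frac{1}{c}\,\mathbf{v}\cdot\nabla\chi\Bigr]
   = \frac{e}{c}\frac{d\chi}{dt} , \\
\frac{d}{dt}\frac{\partial}{\partial\mathbf{v}}\frac{d\chi}{dt}
  - \nabla\frac{d\chi}{dt}
  &= \frac{d}{dt}\,\nabla\chi
     - \nabla\Bigl(\frac{\partial\chi}{\partial t}
        + \mathbf{v}\cdot\nabla\chi\Bigr)
   = 0 .
\end{align*}
```

The Euler-Lagrange expression of the added term vanishes identically, so the equations of motion obtained from (23) are the same in every gauge.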

E. Lorentz: the acknowledged authority, general gauge freedom 

Our focus here is on how H. A. Lorentz became identified as the originator of both the 
condition (19) between $\Phi$ and $\mathbf{A}$ and the retarded solutions (18). In chapter VI, Lorentz 
(1892) presents, without attribution, a theorem that the integral 

$$F(\mathbf{x},t) = \frac{1}{4\pi}\int \frac{1}{r}\, s(\mathbf{x}', t' = t - r/c)\, d^3x' \tag{25a}$$

is a solution of the inhomogeneous wave equation with $s(\mathbf{x},t)$ as a source term, 

$$\frac{1}{c^2}\frac{\partial^2 F}{\partial t^2} - \nabla^2 F = s(\mathbf{x},t) . \tag{25b}$$

He then uses such retarded solutions for time integrals of the vector potential in a discussion 
of dipole radiation. 

In fact, the theorem goes back to Riemann in 1858 and Lorenz in 1861 and perhaps 
others. Riemann apparently read his paper containing the theorem to the Göttingen academy 
in 1858, but his death prevented publication, remedied only in Riemann (1867). In (Lorenz, 
1867b), Lorenz states (25a,b) and remarks that the demonstration is easy, giving as 
reference his paper on elastic waves (Lorenz, 1861). It seems clear that in 1861 Lorenz was 
unaware of Riemann's oral presentation. The posthumous publication of Riemann's note 
occurred simultaneously with and adjacent to Lorenz's 1867 paper in Annalen*. 

* Riemann (1867) showed that retardation led to the quasi-static instantaneous interactions 
of Weber and Kirchhoff, much as done by Lorenz (1867b), and remarked on the connection 
between the velocity of propagation of light and the ratio of electrostatic and electromagnetic 
units. 



In (Lorentz, 1895, Sect. 32), Lorentz quotes the theorem (25a,b), citing (Lorentz, 
1892) for proof, and then in Sect. 33 writes the components of a vector field in the form 
equivalent to (18) with $\mathbf{J} = \rho\mathbf{v}$. He does not call his vector field $(\psi_x, \psi_y, \psi_z)$ the vector 
potential. Having obtained the wave equation for $\mathbf{H}$ with $\nabla\times\mathbf{J}$ as source term, he merely 
notes that if $\mathbf{H}$ is defined as the curl of his vector field, it is sufficient that the field satisfy the 
wave equation with $\mathbf{J}$ as source. We thus see Lorentz in 1895 explicitly exhibiting retarded 
solutions, but without the condition (19). 

In a festschrift volume in honor of the 25th anniversary of Lorentz's doctorate, Emil 
Wiechert (1900) summarizes the history of the wave equation and its retarded solutions. He 
cites Riemann in 1858, Poincaré in 1891, Lorentz (1892, 1895), and Levi-Civita in 1897. No 
mention of Lorenz! In the same volume, des Coudres (1900) cites (Lorentz, 1892) for the 
theorem (25a,b) and calls the retarded solutions (18) "Lorentz'schen Lösungen." It is 
evident that by 1900 the physics community had attributed the retarded solutions for $\Phi$ and $\mathbf{A}$ 
to Lorentz, to the exclusion of others. 

Additional reasons for Lorentz being the reference point for modern classical 
electromagnetism are his magisterial encyclopedia articles (Lorentz, 1904a, 1904b), and his 
book (Lorentz, 1909). Here we find the first clear statement of the arbitrariness of the 
potentials under what we now call general gauge transformations. On p. 157 of (Lorentz, 
1904b), he first states that in order to have the potentials satisfy the ordinary wave 
equations they must be related by 

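$$\nabla\cdot\mathbf{A} = -\frac{1}{c}\frac{\partial\Phi}{\partial t} . \qquad \text{[Lorentz's (2)]}$$

He then discusses the arbitrariness in the potentials, stating that other potentials $\mathbf{A}_0$ and $\Phi_0$ 
may give the same fields, but not satisfy his constraint. He then states "every other 
admissible pair $\mathbf{A}$ and $\Phi$" can be related to the first pair via the transformations, 

$$\mathbf{A} = \mathbf{A}_0 - \nabla\chi , \qquad \Phi = \Phi_0 + \frac{1}{c}\dot{\chi} . \tag{26}$$

He then says that the scalar function $\chi$ can be found so that $\mathbf{A}$ and $\Phi$ do satisfy [Lorentz's 
(2)] by solving the inhomogeneous wave equation, 

$$\nabla^2\chi - \frac{1}{c^2}\ddot{\chi} = \nabla\cdot\mathbf{A}_0 + \frac{1}{c}\dot{\Phi}_0 . \tag{27}$$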
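
Equation (27) follows in one step from (26). A minimal sketch, with an overdot denoting $\partial/\partial t$:

```latex
% Insert A = A_0 - grad(chi), Phi = Phi_0 + (1/c) chi-dot into
% Lorentz's (2), written as div A + (1/c) dPhi/dt = 0.
\begin{align*}
0 &= \nabla\cdot\bigl(\mathbf{A}_0 - \nabla\chi\bigr)
     + \frac{1}{c}\frac{\partial}{\partial t}
       \Bigl(\Phi_0 + \frac{1}{c}\dot{\chi}\Bigr) \\
  &= \nabla\cdot\mathbf{A}_0 + \frac{1}{c}\dot{\Phi}_0
     - \Bigl(\nabla^2\chi - \frac{1}{c^2}\ddot{\chi}\Bigr) .
\end{align*}
```

For a sufficiently well-behaved right-hand side, the theorem (25a,b) supplies a retarded solution for $\chi$, which is why the constraint can always be imposed.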

A reader might question whether Lorentz was here stating the general principle of 
what we term gauge invariance. He stated his constraint before his statement of the 
arbitrariness of the potentials and then immediately restricted $\chi$ to a solution of (27). This 
doubt is removed in his book (Lorentz, 1909). There, in Note 5, he says, 

"Understanding by $\mathbf{A}_0$ and $\Phi_0$ special values, we may represent other values that 
may as well be chosen by [our equation (26)] where $\chi$ is some scalar function 
(emphasis added). We shall determine $\chi$ by subjecting $\mathbf{A}$ and $\Phi$ to the condition 
[Lorentz's (2)] which can always be fulfilled because it leads to the equation [our 
equation (27)] which can be satisfied by a proper choice of $\chi$." 
He then proceeds to the wave equations and the retarded solutions in Sect. 13 of the main 
text. Lorentz obviously preferred potentials satisfying his constraint to the exclusion of 
other choices, but he did recognize the general principle of gauge invariance in classical 
electromagnetism without putting stress on it. 

The dominance of Lorentz's publications as source documents is illustrated by their 
citation by G. A. Schott in his Adams Prize essay (Schott, 1912). On p. 4, Schott 
quotes (19) [his equation (IX)] and the wave equations for $\mathbf{A}$ and $\Phi$. He then cites Lorentz's 
second Encyclopedia article (Lorentz, 1904b) and his book (Lorentz, 1909) for the retarded 
solutions (18) [his equations (X) and (XI)], which he later on the page calls "the Lorentz 
integrals." 

Lorentz's domination aside, the last third of the 19th century saw the fundamentals of 
electromagnetism almost completely clarified, with the ether soon to disappear. Scientists 
went about applying the subject with confidence. They did not focus on niceties such as the 
arbitrariness of the potentials, content to follow Lorentz in use of the retarded potentials 
(18). It was only with the advent of modern quantum field theory and the construction of the 
electroweak theory and quantum chromodynamics that the deep significance of gauge 
invariance emerged. 



III. DAWNING OF THE QUANTUM ERA 

A. 1926: Schrödinger, Klein, Fock 

The year 1926 saw the flood gates open. Quantum mechanics, or more precisely, wave 
mechanics, blossomed at the hands of Erwin Schrödinger and many others. Among the myriad 
contributions, we focus only on those that relate to our story of the emergence of the principle 
of gauge invariance in quantum theory. The pace among this restricted set is frantic enough. 
[To document the pace, we augment the references for the papers in this era with submission 
and publication dates.] The thread we pursue is the relativistic wave equation for spinless 
charged particles, popularly known nowadays as the Klein-Gordon equation. The presence of 
both the scalar and vector potentials brought forth the discovery of the combined 
transformations (1a,b,c) by Fock. 

The relativistic wave equation for a spinless particle with charge e interacting with 
electromagnetic fields is derived in current textbooks by first transforming the classical 
constraint equation for a particle of 4-momentum $p^\mu = (p^0, \mathbf{p})$ and mass m, $p_\mu p^\mu = (mc)^2$, by 
the substitution $p^\mu \to p^\mu - eA^\mu/c$, where $A^\mu = (A^0 = \Phi, \mathbf{A})$ is the 4-vector electromagnetic 
potential. Here we use the metric $g^{00} = 1$, $g^{ij} = -\delta_{ij}$. Then a quantum mechanical operator 
acting on a wave function $\psi$ is constructed by the operator substitution, $p^\mu \to i\hbar\partial^\mu$, where 
$\partial^\mu = \partial/\partial x_\mu = (\partial^0, -\nabla)$. Explicitly, we have 

$$(i\hbar\partial^\mu - eA^\mu/c)(i\hbar\partial_\mu - eA_\mu/c)\,\psi = (mc)^2\,\psi . \tag{28}$$

Alternatively, we divide through by $-\hbar^2$ and write 

$$\left[(\partial^\mu + ieA^\mu/\hbar c)(\partial_\mu + ieA_\mu/\hbar c) + (mc/\hbar)^2\right]\psi = 0 . \tag{29}$$

Separation of the space and time dimensions and choice of a constant energy solution, 
$\psi \propto \exp(-iEt/\hbar)$, yields the relativistic version of the Schrödinger equation, 

$$-\hbar^2c^2\nabla^2\psi + ie\hbar c\,(\partial_\mu A^\mu)\,\psi + 2ie\hbar c\,\mathbf{A}\cdot\nabla\psi + e^2\,\mathbf{A}\cdot\mathbf{A}\,\psi = \left[(E - e\Phi)^2 - (mc^2)^2\right]\psi . \tag{30}$$

The second term on the left is absent if the Lorenz gauge condition $\partial_\mu A^\mu = 0$ is chosen for the 
potentials. 
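
The passage from (29) to (30) is a direct expansion. The sketch below uses the metric convention stated above and, as an assumption made explicit here, writes $\partial_\mu = (\partial/c\,\partial t, \nabla)$ for the covariant derivative and lets $\partial/\partial t$ act on $\psi \propto \exp(-iEt/\hbar)$.

```latex
% Expand the product in (29), then evaluate each piece for a
% constant-energy solution psi ~ exp(-iEt/hbar).
\begin{align*}
\Bigl(\partial^\mu + \frac{ie}{\hbar c}A^\mu\Bigr)
\Bigl(\partial_\mu + \frac{ie}{\hbar c}A_\mu\Bigr)\psi
 &= \partial^\mu\partial_\mu\psi
  + \frac{ie}{\hbar c}(\partial_\mu A^\mu)\psi
  + \frac{2ie}{\hbar c}A^\mu\partial_\mu\psi
  - \frac{e^2}{\hbar^2c^2}A^\mu A_\mu\psi , \\
\partial^\mu\partial_\mu\psi
 &= \Bigl(-\frac{E^2}{\hbar^2c^2} - \nabla^2\Bigr)\psi , \\
A^\mu\partial_\mu\psi
 &= -\frac{iE}{\hbar c}\,\Phi\,\psi + \mathbf{A}\cdot\nabla\psi , \\
A^\mu A_\mu
 &= \Phi^2 - \mathbf{A}\cdot\mathbf{A} .
\end{align*}
```

Adding $(mc/\hbar)^2\psi$, multiplying by $\hbar^2c^2$, and collecting $E^2 - 2eE\Phi + e^2\Phi^2 = (E - e\Phi)^2$ on the right-hand side reproduces (30).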

The first of Schrödinger's four papers (Schrödinger, 1926a) was submitted on 27 
January and published on 13 March. It was devoted largely to the nonrelativistic 
time-independent wave equation and simple potential problems, but in Section 3 he mentions the 
results of his study of the "relativistic Kepler problem." (An English translation of 
Schrödinger's 1926 papers can be found in (Schrödinger, 1978)). According to his biographer 
(Moore, 1989, p. 194-197), Schrödinger derived the relativistic wave equation in November 
1925, began solving the problem of the hydrogen atom while on vacation at Christmas, and 
completed it in early January 1926. Disappointed that he had not obtained the Sommerfeld 
fine-structure formula, he did not publish his work, but focused initially on the nonrelativistic 
equation. Some months later, in Sect. 6 of his fourth paper (Schrödinger, 1926b), he 
tentatively presented the relativistic equation in detail and discussed its application to the 
hydrogen atom and to the Zeeman effect. 

Schrödinger was not the only person to consider the relativistic wave equation. In a 
private letter to Jordan dated 12 April 1926, Pauli used the relativistic connection between 
energy and momentum to derive a wave equation equivalent to (29) with a static potential, 
then specialized to the nonrelativistic Schrödinger equation, and went on to his main purpose 
- to show the equivalence of matrix mechanics and wave mechanics (van der Waerden, 1973). 
In the published literature, Oskar Klein (1926) treated a five-dimensional relativistic 
formalism and explicitly exhibited the four-dimensional relativistic wave equation for fixed 
energy with a static scalar potential. He showed that the nonrelativistic limit was the 
time-independent Schrödinger equation, but did not discuss any solutions. Before publication of 
Klein's paper, Fock (1926a) independently derived the relativistic wave equation from a 
variational principle and solved the relativistic Kepler problem. He observed that Schrödinger 
had already commented on the solution in his first paper. In his paper Fock did not include 
the general electromagnetic interaction. Schrödinger comments in the introduction 
("Abstract") to his collected papers (Schrödinger, 1978), 

"V. Fock carried out the calculations quite independently in Leningrad, before my last 
paper [(Schrödinger, 1926b)] was sent in, and also succeeded in deriving the 
relativistic equation from a variational principle. Zeitschrift für Physik 38, 242 (1926)." 

The discovery of the symmetry under gauge transformations (1a,b,c) of the quantum 
mechanical system of a charged particle interacting with electromagnetic fields is due to Fock 
(1926b). His paper was submitted on 30 July 1926 and published on 2 October 1926. In it he 
first discussed the special-relativistic wave equation of his earlier paper with electromagnetic 
interactions and addressed the effect of the change in the potentials (1a,b). He showed that 
the equation is invariant under the change in the potentials provided the wave function is 
transformed according to (1c). He went on to treat a five-dimensional general-relativistic 
formalism, similar to but independent of Klein. In a note added in proof, Fock notes that 

"While this note was in proof, the beautiful work of Oskar Klein [published on 10 July] 
arrived in Leningrad," 
and that the principal results were identical. 

That fall others contributed. Kudar (1926) wrote the relativistic equations down in 
covariant notation, citing Klein (1926) and Fock (1926a). He remarked that his general 
equation reduced to Fock's for the Kepler problem with the appropriate choice of potentials. 
Walter Gordon (1926) discussed the Compton effect using the relativistic wave equation to 
describe the scattering of light by a charged particle. He referred to Schrödinger's first three
papers, but not the fourth (Schrödinger, 1926b) in which Schrödinger actually treats the
relativistic equation. Gordon does not cite Klein or either of Fock's papers. 

The above paragraphs show the rapid pace of 1926, the occasional duplication, and the 
care taken by some, but not all, for proper acknowledgment of prior work by others. If we go 
chronologically by publication dates, the Klein-Gordon equation should be known as the 
Klein-Fock-Schrödinger equation; if by notebooks and letters, Schrödinger and Pauli could
claim priority. Totally apart from the name attached to the relativistic wave equation, the 
important point in our story is Fock's paper on the gauge invariance, published on 2 October 
1926 (Fock, 1926b). 

The tale now proceeds to the enshrinement by Weyl of symmetry under gauge 
transformations as a guiding principle for the construction of a quantum theory of matter 
(electromagnetism and gravity). Along the way, we retell the well-known story of how the 
seemingly inappropriate word "gauge" came to be associated with the transformations 
(1a,b,c) and today's generalizations.

B. Weyl: gauge invariance as a basic principle

Fritz London, in a short note in early 1927 (London, 1927a) and soon after in a longer 
paper (London, 1927b), proposed a quantum mechanical interpretation of Weyl's failed 
attempt to unify electromagnetism and gravitation (Weyl, 1919). This attempt was 
undertaken long before the discovery of quantum mechanics. London noticed that Weyl's
principle of invariance of his theory under a scale change of the metric tensor
$g^{\mu\nu} \to g^{\mu\nu} \exp\lambda(x)$, where $\lambda(x)$ is an arbitrary function of the space-time coordinates, was
equivalent in quantum mechanics to the invariance of the wave equation under the
transformations (1) provided $\lambda(x)$ was made imaginary. In his short note London cites Fock
(1926b) but does not repeat the citation in his longer paper, although he does mention 
(without references) both Klein and Fock for the relativistic wave equation in five 
dimensions. 

To understand London's point we note first that Weyl's incremental change of length
scale $d\ell = \ell\,\varphi_\nu\,dx^\nu$ leads to a formal solution $\ell = \ell_0 \exp\lambda(x)$, where $\lambda(x) = \int^x \varphi_\nu\,dx^\nu$; the
indefinite integral over the real "potential" $\varphi_\nu$ is path-dependent. If we return to the
relativistic wave equation (29), we observe that a formal solution for a particle interacting
with the electromagnetic potential $A^\mu$ can be written in terms of the solution without
interaction as

$$\psi = \exp\left(-\frac{ie}{\hbar c}\int^x A_\mu\,dx^\mu\right)\psi_0 , \tag{31}$$

where $\psi_0$ is the zero-field solution. [Recovery of (29) may be accomplished by "solving" for
$\psi_0$ and requiring $\partial_\mu\partial^\mu\psi_0 + (mc/\hbar)^2\psi_0 = 0$.] With the gauge transformation of the 4-vector
potential

$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi , \tag{32}$$

the difference in phase factors is obviously the integral of a perfect differential, $-\partial_\mu\chi\,dx^\mu$. Up
to a constant phase, the wave functions $\psi'$ and $\psi$ are thus related by the phase
transformation

$$\psi' = \exp(ie\chi(x)/\hbar c)\,\psi , \tag{33}$$

which is precisely Fock's (1c). London actually expressed his argument in terms of "scale
change" $\ell = \ell_0 \exp(i\lambda(x))$, where $i\lambda(x)$ is the quantity in the exponential in (31), and wrote
$\psi/\ell = \psi_0/\ell_0$.
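
The covariance expressed by the gauge transformation of the potential and the accompanying phase transformation of the wave function can be checked numerically. What follows is only a minimal sketch, not part of the historical argument. It assumes one space dimension and units with e = ħ = c = 1, it uses the three-vector form (1a) and (1c) of the transformation, and the functions `A`, `chi`, and `psi` are arbitrary illustrative choices. It confirms, to finite-difference accuracy, that the combination (d/dx - iA)ψ acquires the same local phase factor as ψ itself.

```python
import cmath
import math

# Units with e = hbar = c = 1 (an assumption made purely for illustration).

def A(x):
    """Hypothetical vector potential in one space dimension."""
    return math.sin(x) + 0.3 * x

def chi(x):
    """Hypothetical gauge function."""
    return 0.5 * x**2 + math.cos(2.0 * x)

def psi(x):
    """Hypothetical wave function."""
    return cmath.exp(1j * 1.7 * x) * math.exp(-0.1 * x**2)

def A_prime(x, h=1e-5):
    """Transformed potential, (1a): A' = A + d(chi)/dx (central difference)."""
    return A(x) + (chi(x + h) - chi(x - h)) / (2 * h)

def psi_prime(x):
    """Transformed wave function, (1c): psi' = exp(i chi) psi."""
    return cmath.exp(1j * chi(x)) * psi(x)

def covariant_derivative(f, a, x, h=1e-5):
    """D f = df/dx - i a(x) f(x), with df/dx taken by central difference."""
    dfdx = (f(x + h) - f(x - h)) / (2 * h)
    return dfdx - 1j * a(x) * f(x)

def max_mismatch(points):
    """Largest |D'psi' - exp(i chi) D psi| over the sample points."""
    worst = 0.0
    for x in points:
        lhs = covariant_derivative(psi_prime, A_prime, x)
        rhs = cmath.exp(1j * chi(x)) * covariant_derivative(psi, A, x)
        worst = max(worst, abs(lhs - rhs))
    return worst

if __name__ == "__main__":
    xs = [-2.0 + 0.25 * n for n in range(17)]
    print("largest mismatch:", max_mismatch(xs))
```

The mismatch is limited only by the finite-difference step; it does not vanish identically because the derivatives are approximated, not because the covariance fails.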

The "gradient invariance" of Fock became identified by London and then by Weyl with 
an analogue of Weyl's "Eichinvarianz" (scale invariance), even though the former concerns a
local phase change and the latter a coordinate scale change. In his famous book,
"Gruppentheorie und Quantenmechanik" (Weyl, 1928), Weyl discusses the coupling of a
relativistic charged particle with the electromagnetic field. He observes, without references,
that the electromagnetic equations and the relativistic Schrödinger equation (28) are invariant
under the transformations (1a,b,c). Weyl then states on p. 88 (in translation):

"This 'principle of gauge invariance' is quite analogous to that previously set up by the
author, on speculative grounds, in order to arrive at a unified theory of gravitation and
electricity [22]. But I now believe that this gauge invariance does not tie together
electricity and gravitation, but rather electricity and matter in the manner described
above."

His note 22 refers to his own work, to Schrödinger (1923), and to London (1927b). In the
first (1928) edition, the next sentence reads (again in translation):

"How gravitation according to general relativity must be incorporated is not certain at
present."

By the second (1931) edition, this sentence has disappeared, undoubtedly because he 
believed that his own work in the meantime (Weyl, 1929a, 1929b) had shown the connection. 
In fact, in the second edition a new section 6 appears in Chapter IV, in which Weyl elaborates
on how the gauge transformation (1c) can only be fully understood in the context of general
relativity.

Weyl's 1928 book and his papers in 1929 demonstrate an evolving point of view 
unique to him. Presumably prompted by London's observation, he addressed the issue of 
gauge invariance in relativistic quantum mechanics, knowing on the one hand that the 
principle obviously applied to the electromagnetic fields and charged matter waves, and on 
the other hand wanting to establish contact with his 1919 "Eichinvarianz." As we have just
seen, in his 1928 book he presented the idea of gauge invariance in the unadorned version of
Fock, without the "benefit" of general relativity. But in the introduction of the first of his 1929
papers on the electron and gravitation (Weyl, 1929a), he states the "principle of gauge
invariance" (the first use of the words in English) very much as in his book, citing only it for
authority. He then goes on to show that the conservation of electricity is a double
consequence of gauge invariance (through the matter and the electromagnetic equations) and
that

"This new principle of gauge invariance, which may go by the same name, has the
character of general relativity since it contains an arbitrary function $\lambda$, and can
certainly only be understood with reference to it."
He elaborated on this point in (Weyl, 1929b): 

" In special relativity one must regard this gauge-factor as a constant because here 
we have only a single point-independent tetrad. Not so in general relativity; every 
point has its own tetrad and hence its own arbitrary gauge-factor: because by the 
removal of the rigid connection between tetrads at different points the gauge-factor 
necessarily becomes an arbitrary function of position." (translation taken from 
O'Raifeartaigh and Straumann, 2000, p. 7). 
Nevertheless, Weyl stated (Weyl, 1929a, p. 332, below equation (8)),

"If our view is correct, then the electromagnetic field is a necessary accompaniment of 
the matter wave field and not of gravitation." 
The last sentence of (Weyl, 1929b) contains almost the same words. His viewpoint about the
need for general relativity can perhaps be understood in the sense that $\lambda$ must be an arbitrary
function in the curved space-time of general relativity, but not necessarily in special relativity,
and his desire to provide continuity with his earlier work. The close mathematical relation
between non-abelian gauge fields and general relativity as connections in fiber bundles was
not generally realized until much later (see, e.g., Yang, 1986; O'Raifeartaigh and Straumann,
2000).

Historically, of course, Weyl's 1929 papers were a watershed. They enshrined as 
fundamental the modern principle of gauge invariance, in which the existence of the 4-vector 
potentials (and field strengths) follows from the requirement of the invariance of the matter
equations under gauge transformations such as (1c) of the matter fields. This principle is the
touchstone of the theory of gauge fields, so dominant in theoretical physics in the second half 
of the 20th century. The important developments beyond 1929 can be found in the reviews 
already mentioned in the Introduction. The reader should be warned, however, of a curiosity 
regarding the citation of Fock's 1926 paper (Fock, 1926b) by O'Raifeartaigh (1997), 
O'Raifeartaigh and Straumann (2000), and Yang (1986, 1987). While the volume and page 
number are given correctly, the year is invariably given as 1927. One of the writers privately 
blames it on Pauli (1933). Indeed, Pauli made that error, but he did give to Fock the priority 
of introducing gauge invariance in quantum theory.

While for Electroweak Theory and QCD gauge invariance is of paramount importance,
its physical meaning in QED per se does not seem to be extremely profound. A tiny mass of
the photon would destroy the gauge invariance of QED, as a mass term $m_\gamma^2 A^2$ in the
Lagrangian is not gauge invariant. At the same time the excellent agreement of QED with
experiment and in particular its renormalizability would not be impaired (see, e.g., Kobzarev
and Okun, 1968; Goldhaber and Nieto, 1971). On the other hand the renormalizability would
be destroyed by an anomalous magnetic moment term in the Lagrangian, $\mu\bar\psi\sigma_{\mu\nu}\psi F^{\mu\nu}$, in
spite of its manifest gauge invariance. What is really fundamental in electrodynamics is the
conservation of electromagnetic current or in other words conservation of charge (see, e.g.,
Okun, 1986, Lecture 1). Conservation of charge makes the effects caused by a possible
nonvanishing mass of the photon, $m_\gamma$, proportional to $m_\gamma^2$ and therefore negligibly small for
small enough values of $m_\gamma$.

It should be stressed that the existing upper limits on the value of $m_\gamma$ lead in the case
of non-conserved current to such catastrophic bremsstrahlung, that most of the experiments 
which search for monochromatic photons in charge-nonconserving processes become 
irrelevant (Okun and Zeldovich, 1978). Further study has shown (Voloshin and Okun, 1978) 
that reabsorption of virtual bremsstrahlung photons restores the conservation of charge (for 
reviews, see Okun, 1989, 1992). 

As has been emphasized above, gauge invariance is a manifestation of
non-observability of $A_\mu$. However, integrals such as in Eq. (31) are observable when they are
taken over a closed path, as in the Aharonov-Bohm effect (Aharonov and Bohm, 1959). The 
loop integral of the vector potential there can be converted by Stokes's theorem into the 
magnetic flux through the loop, showing that the result is expressible in terms of the magnetic 
field, albeit in a nonlocal manner. It is a matter of choice whether one wishes to stress the 
field or the potential, but the local vector potential is not an observable. 
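
The last point can be illustrated with a short numerical sketch. It is an illustration under stated assumptions only: an idealized infinite solenoid, arbitrary values for the flux and the solenoid radius, and a hypothetical single-valued gauge function χ = x²y + sin x. The local potentials in the two gauges differ, but the loop integral equals the enclosed magnetic flux in both.

```python
import math

FLUX = 2.5              # hypothetical total magnetic flux of the solenoid
SOLENOID_RADIUS = 0.5   # hypothetical solenoid radius

def A_symmetric(x, y):
    """Vector potential (Ax, Ay) of an ideal solenoid in the symmetric gauge."""
    r2 = x * x + y * y
    if r2 < SOLENOID_RADIUS**2:
        scale = FLUX / (2 * math.pi * SOLENOID_RADIUS**2)  # uniform field inside
    else:
        scale = FLUX / (2 * math.pi * r2)                  # field-free region outside
    return (-scale * y, scale * x)

def grad_chi(x, y):
    """Gradient of the single-valued gauge function chi = x^2 y + sin(x)."""
    return (2 * x * y + math.cos(x), x * x)

def A_transformed(x, y):
    """Gauge-transformed potential A' = A + grad(chi)."""
    ax, ay = A_symmetric(x, y)
    gx, gy = grad_chi(x, y)
    return (ax + gx, ay + gy)

def loop_integral(field, radius, n=2000):
    """Midpoint-rule line integral of field . dl around a circle about the origin."""
    total = 0.0
    dphi = 2 * math.pi / n
    for k in range(n):
        phi = (k + 0.5) * dphi
        x, y = radius * math.cos(phi), radius * math.sin(phi)
        fx, fy = field(x, y)
        # dl = radius * (-sin phi, cos phi) * dphi
        total += (-fx * math.sin(phi) + fy * math.cos(phi)) * radius * dphi
    return total

if __name__ == "__main__":
    print("local potentials at (1, 0):", A_symmetric(1.0, 0.0), A_transformed(1.0, 0.0))
    print("loop integrals at r = 1:",
          loop_integral(A_symmetric, 1.0), loop_integral(A_transformed, 1.0))
```

For a loop lying inside the solenoid the same function returns only the fraction of the flux that the loop encloses, as Stokes's theorem requires.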

B. Examples of gauges

The gauge invariance of classical field theory and of electrodynamics in particular allows one
to consider the potential $A_\mu$ with various gauge conditions, most of them being not Poincaré
invariant:

$$\partial_\mu A^\mu = 0 \quad (\mu = 0, 1, 2, 3), \qquad \text{Lorenz gauge} \tag{34}$$

$$\nabla\cdot\mathbf{A} = \partial_j A_j = 0 \quad (j = 1, 2, 3), \qquad \text{Coulomb gauge or radiation gauge} \tag{35}$$

$$n_\mu A^\mu = 0 \quad (n^2 = 0), \qquad \text{light cone gauge} \tag{36}$$

$$A_0 = 0, \qquad \text{Hamiltonian or temporal gauge} \tag{37}$$

$$A_3 = 0, \qquad \text{axial gauge} \tag{38}$$

$$x_\mu A^\mu = 0, \qquad \text{Fock-Schwinger gauge} \tag{39}$$

$$x_j A_j = 0, \qquad \text{Poincaré gauge} \tag{40}$$

An appropriate choice of gauge simplifies calculations. This is illustrated by many examples
presented in textbooks, e.g., (Jackson, 1998). For the quantum mechanics of nonrelativistic 
charged particles interacting with radiation, the Coulomb gauge is particularly convenient 
because the instantaneous scalar potential describing the static interactions and binding is 
unquantized; only the transverse vector potential of the photons is quantized. However, 
noncovariant gauges, characterized by fixing a direction in Minkowski space, pose a number 
of problems discussed in Gaigg, Kummer, and Schweda (1990). The problems acquire
additional dimensions in quantum field theory where one has to deal with a space of states 
and with a set of operators. 

In QED the gauge degree of freedom has to be fixed before the theory is quantized.
Usually the gauge fixing term $(\partial_\mu A^\mu)^2$ is added to the gauge invariant Lagrangian density
with coefficient $1/2\alpha$ (for further details see Gaigg, Kummer, and Schweda, 1990; Berestetskii,
Lifshitz, and Pitaevskii, 1971; Ramond, 1981; Zinn-Justin, 1993). In perturbation theory the
propagator of a virtual photon with 4-momentum k acquires the form, 



$$D_{\mu\nu}(k) = -\frac{1}{k^2}\left[g_{\mu\nu} + (\alpha - 1)\frac{k_\mu k_\nu}{k^2}\right] . \tag{41}$$



The most frequently used cases are 

$$\alpha = 1 \quad \text{(Feynman gauge)}, \tag{42}$$

$$\alpha = 0 \quad \text{(Landau gauge)}. \tag{43}$$

In the Feynman gauge the propagator (41) is simpler, while in the Landau gauge its 
longitudinal part vanishes, which is often more convenient. If calculations are carried out 
correctly, the final result will not contain the gauge parameter $\alpha$.
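
The independence of the gauge parameter can be made concrete with a small numerical sketch. It assumes that the propagator (41) has the form $-[g_{\mu\nu} + (\alpha - 1)k_\mu k_\nu/k^2]/k^2$ with a metric of signature (+,-,-,-); the momentum `K` and the vectors `J1_RAW`, `J2_RAW` are arbitrary illustrative numbers, and `make_conserved` is a hypothetical helper that enforces current conservation by projection. Between conserved currents the contraction with the propagator is the same for every value of the gauge parameter, while for non-conserved currents it is not.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g; signature (+,-,-,-) is an assumption

def dot(a, b):
    """Minkowski product a.b of two 4-vectors with upper indices."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))

def propagator(k, alpha):
    """D_{mu nu}(k) in the assumed form of (41), as a 4x4 nested list (lower indices)."""
    k2 = dot(k, k)
    k_low = [g * x for g, x in zip(METRIC, k)]
    return [[-((METRIC[m] if m == n else 0.0)
               + (alpha - 1.0) * k_low[m] * k_low[n] / k2) / k2
             for n in range(4)] for m in range(4)]

def make_conserved(j, k):
    """Remove the part of j along k so that k.j = 0 (current conservation)."""
    c = dot(k, j) / dot(k, k)
    return [x - c * y for x, y in zip(j, k)]

def amplitude(j1, j2, k, alpha):
    """Contraction j1^mu D_{mu nu}(k) j2^nu."""
    d = propagator(k, alpha)
    return sum(j1[m] * d[m][n] * j2[n] for m in range(4) for n in range(4))

def longitudinal_part(k, alpha):
    """k^mu D_{mu nu}(k); proportional to alpha, so it vanishes in the Landau gauge."""
    d = propagator(k, alpha)
    return [sum(k[m] * d[m][n] for m in range(4)) for n in range(4)]

# Hypothetical off-shell photon momentum and two arbitrary 4-vectors.
K = [1.3, 0.2, -0.7, 0.4]
J1_RAW = [1.0, 0.5, -0.3, 0.8]
J2_RAW = [0.2, -1.0, 0.4, 0.1]

if __name__ == "__main__":
    j1, j2 = make_conserved(J1_RAW, K), make_conserved(J2_RAW, K)
    for alpha in (0.0, 1.0, 3.0):  # Landau, Feynman, and alpha = 3
        print(alpha, amplitude(j1, j2, K, alpha), amplitude(J1_RAW, J2_RAW, K, alpha))
```

The first printed column of amplitudes is the same in every row, while the second changes with the gauge parameter, which is the numerical counterpart of the statement that only conserved currents guarantee a gauge-independent result.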

In the static (zero frequency) limit the propagator (41) reduces to 

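$$D_{ij}(\mathbf{k},0) = -\frac{1}{|\mathbf{k}|^2}\left(\delta_{ij} + (\alpha - 1)\frac{k_i k_j}{|\mathbf{k}|^2}\right) , \tag{44}$$

$$D_{00}(\mathbf{k},0) = \frac{1}{|\mathbf{k}|^2} , \tag{45}$$

$$D_{0i}(\mathbf{k},0) = D_{i0}(\mathbf{k},0) = 0 . \tag{46}$$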



This is the propagator for the Helmholtz potential (14) and the static Coulomb potential. 

Various gauges have been associated with names of physicists, a process begun by 
Heitler who introduced the term "Lorentz relation" in the first edition of his book (Heitler, 
1936). In the third edition (Heitler, 1954) he used "Lorentz gauge" and "Coulomb gauge." 
Zumino (1960) introduced the terms "Feynman gauge," "Landau gauge," and "Yennie
gauge" ($\alpha = 3$ in (41)).

What is now generally known as a gauge transformation of the electromagnetic
potentials (1a,b) was discovered in the process of formulation of classical electrodynamics
by its creators, Lorenz, Maxwell, Helmholtz, and Lorentz, among others (1867-1909). The
phase transformation (1c) of the quantum mechanical charged field accompanying the
transformation of the electromagnetic potentials was discovered by Fock (1926b). The term
"gauge" was applied to this transformation by Weyl (1928, 1929a, 1929b) (who used "eich-"
a decade before to denote a scale transformation in his unsuccessful attempt to unify gravity
and electromagnetism).

In textbooks on classical electrodynamics the gauge invariance (1a,b) was first
discussed by Lorentz in his influential book, "Theory of Electrons" (Lorentz, 1909). The first 
derivation of the invariance of the Lagrangian for the combined system of electromagnetic 
fields and charged particles was presented by Landau and Lifshitz (1941) (with reference to 
Fock, they used Fock's term "gradient invariance"). 

The first model of a non-abelian gauge theory of weak, strong, and electromagnetic 
interactions was proposed by Klein (1938) (who did not use the term "gauge" and did not 
refer to Weyl). But this attempt was firmly forgotten. The modern era of gauge theories 
started with the paper by Yang and Mills (1954). 

The history of gauge invariance resembles a random walk, with the roles of some 
important early players strangely diminished with time. There is a kind of echo between the 
loss of interest by O. Klein and L. Lorenz in their "god blessed children". It is striking that the 
notion of gauge symmetry did not appear in the context of classical electrodynamics, but 
required the invention of quantum mechanics. It is amusing how little the authors of
textbooks know about the history of physics. For further reading the Resource Letter
(Cheng and Li, 1988) is recommended.

Figure 1. Two closed current-carrying circuits C and C' with currents I and I', respectively.